  • BRETON Arthur
  • GAMBOA Vinchi
  • DORANGE Romain
  • HU Clement
  • LONGO Giuliano
  • NATH VARMA Vitten

Objectives

Can we issue a buy/sell recommendation for a cryptocurrency over a 7-day holding period?

What data is relevant for our model?

  • Price
  • Volume
  • Correlation
  • Social Media
  • Google Trends
  • Utility Indicator

Data Preparation

We used an external Python script to scrape the website www.coinmarketcap.com.

Besides technical analysis of each currency, we also wanted to include a trend factor in our data, so we looked at Google Trends.

Our hypothesis was that there is a strong correlation between Google searches and cryptocurrency prices.
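One way to check this hypothesis is to correlate day-to-day changes rather than raw levels, since two series that both trend upward can look correlated for spurious reasons. The sketch below uses made-up numbers: `price` and `searches` are hypothetical vectors, not the scraped data.

```r
# Hypothetical daily closing prices and Google Trends scores (illustrative only).
price    <- c(100, 104, 103, 110, 120, 118, 125, 140)
searches <- c(20, 24, 22, 30, 41, 38, 45, 60)

# Work on changes: log returns for the price, first differences for searches.
price_change  <- diff(log(price))
search_change <- diff(searches)

# Pearson correlation, plus a rank-based alternative that is less
# sensitive to a few extreme days.
rho_pearson  <- cor(price_change, search_change)
rho_spearman <- cor(price_change, search_change, method = "spearman")
```

A correlation close to 1 on real data would support the hypothesis. A value near 0 would suggest that search interest adds little beyond the price history itself.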

Feature Overview

  • Volume
  • Momentum
  • Volatility
  • Trend
  • Buy / Sell classifier

Downloading Data

After scraping the coinmarketcap website, we had the following data:

We also managed to download Google Trends data using a specific R library.

scrapGTrendsForKeywords(c("BTC","ETH","XRP","EOS","LTC"), "gtrends.csv")

At the moment we are unable to sort our currencies by market cap.

Cleanup data

  • Remove unused IDs
  • Format Dates
  • Interpolate missing data
    • Locate missing dates, insert row, and interpolate values
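The interpolation step above can be sketched in base R as follows. This is a minimal illustration under assumptions: the `prices` data frame, its `date` and `close` columns, and the helper `fillMissingDates` are hypothetical names, and linear interpolation is one possible choice.

```r
# Hypothetical daily data with two missing dates (2018-01-03 and 2018-01-04).
prices <- data.frame(
  date  = as.Date(c("2018-01-01", "2018-01-02", "2018-01-05")),
  close = c(100, 110, 140)
)

fillMissingDates <- function(df) {
  # Build the full daily calendar between the first and last observation.
  all_dates <- data.frame(date = seq(min(df$date), max(df$date), by = "day"))
  # Left join: every missing date becomes a new row with NA values.
  full <- merge(all_dates, df, by = "date", all.x = TRUE)
  # Linear interpolation between the surrounding known values.
  full$close <- approx(x = as.numeric(df$date), y = df$close,
                       xout = as.numeric(full$date))$y
  full
}

filled <- fillMissingDates(prices)
# filled$close is now 100, 110, 120, 130, 140
```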

Visualize initial Data

plotGTrends(google.trends)

Feature Engineering

We build these variables over different lag periods:

  • 7 days
  • 14 days
  • 21 days

Create Overall Market statistics

The first part of the analysis creates a new dataframe containing the full history of the market along with useful indicators for technical analysis.

  • Total daily market cap
  • Volatility 7d, 30d, 90d
  • Returns and log returns
  • Volume

name   type   slug                      mcap        beta
BTC    coin   bitcoin           110701963288   0.9773526
ETH    coin   ethereum           49547821373   0.9059106
XRP    coin   ripple             20953892318   0.9308553
BCH    coin   bitcoin-cash       14724786964   1.0251973
EOS    token  eos                 9552774356   1.2269009
LTC    coin   litecoin            5498122696   0.9448140
XLM    coin   stellar             4303693082   0.9678134
ADA    coin   cardano             4225775445   1.4754460
MIOTA  coin   iota                3291464170   1.0917421
TRX    token  tron                2829634354   1.4993969
USDT   token  tether              2618194620  -0.0132857
NEO    coin   neo                 2508226500   0.8396604
DASH   coin   dash                2094727525   0.6616042
XMR    coin   monero              1989915245   0.9092103
XEM    coin   nem                 1750283999   1.0847094
BNB    token  binance-coin        1732275792   1.1453165
VEN    token  vechain             1571361432   1.1307183
ETC    coin   ethereum-classic    1431992689   0.9812628
QTUM   coin   qtum                 948949715   0.8378323
OMG    token  omisego              932475042   0.9637187
ONT    token  ontology             907303711   0.6437450
ZEC    coin   zcash                798735715   0.8147235
BCN    coin   bytecoin-bcn         774143986   0.7442515
ICX    token  icon                 763252332   1.0782522
LSK    coin   lisk                 703390829   0.8420246
ZIL    token  zilliqa              661689897   1.2276519
DCR    coin   decred               650912867   0.6353398
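The market-level statistics and the beta column above could be computed along the following lines. This is a sketch under assumptions, not the code behind the table: the `coins` data frame and the helper `coinBeta` are hypothetical, the numbers are made up, and beta is taken in its standard sense as the covariance of a coin's returns with the market's returns divided by the variance of the market's returns.

```r
# Hypothetical per-coin daily market caps in long format (illustrative only).
coins <- data.frame(
  date = rep(as.Date("2018-01-01") + 0:5, times = 2),
  name = rep(c("AAA", "BBB"), each = 6),
  mcap = c(100, 110, 105, 120, 118, 130,
            50,  52,  51,  53,  55,  54)
)

# Total market cap per day.
market <- aggregate(mcap ~ date, data = coins, FUN = sum)
market <- market[order(market$date), ]

# Simple and log returns; the first day has no previous value.
market$return    <- c(NA, diff(market$mcap) / head(market$mcap, -1))
market$logreturn <- c(NA, diff(log(market$mcap)))

# Beta of one coin against the market: cov(coin, market) / var(market).
coinBeta <- function(coin_name) {
  coin <- coins[coins$name == coin_name, ]
  coin <- coin[order(coin$date), ]
  coin_ret <- diff(log(coin$mcap))
  mkt_ret  <- market$logreturn[-1]
  cov(coin_ret, mkt_ret) / var(mkt_ret)
}

beta.aaa <- coinBeta("AAA")
```

Under this definition, a beta near 1 means the coin moves roughly in line with the whole market, and a beta near 0 (as for USDT in the table) means it barely reacts to market moves.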

Volume

Average traded volume over a period:

  • volume.7d
  • volume.14d
  • volume.21d
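A trailing average of this kind can be sketched with `stats::filter`. The `volume` vector and the helper `rollingMean` are hypothetical names used for illustration only.

```r
# Hypothetical daily traded volume (illustrative only).
volume <- c(10, 12, 11, 15, 14, 13, 16, 18, 17, 20)

# Trailing moving average: sides = 1 uses only the current and past values,
# so no future information leaks into the feature.
rollingMean <- function(x, window) {
  as.numeric(stats::filter(x, rep(1 / window, window), sides = 1))
}

volume.7d <- rollingMean(volume, 7)
# The first 6 entries are NA because a full 7-day window is not yet available.
```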

Volatility

Standard deviation of the returns over a period:

  • volatility.7d
  • volatility.14d
  • volatility.21d
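As a rough sketch, the rolling standard deviation can be written with a simple loop. The `close` prices and the helper `rollingSd` are assumptions for illustration. Daily simple returns are used here, but log returns would work the same way.

```r
# Hypothetical daily closing prices (illustrative only).
close   <- c(100, 102, 101, 105, 107, 106, 110, 108, 112, 115)
returns <- c(NA, diff(close) / head(close, -1))

# Trailing standard deviation over `window` observations.
rollingSd <- function(x, window) {
  out <- rep(NA_real_, length(x))
  for (i in seq_along(x)) {
    if (i >= window) out[i] <- sd(x[(i - window + 1):i])
  }
  out
}

volatility.7d <- rollingSd(returns, 7)
# Entry 7 is still NA because its window includes the undefined first return.
```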

Momentum

Score representing the difference between upward and downward returns over a period.

  • momentum.7d
  • momentum.14d
  • momentum.21d
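The exact scoring rule is not spelled out here, so the sketch below assumes one plausible reading: the number of up days minus the number of down days within the window. The `returns` vector and the helper `momentumScore` are hypothetical.

```r
# Hypothetical daily returns (illustrative only).
returns <- c(0.02, -0.01, 0.03, 0.01, -0.02, 0.04, 0.01, -0.03, 0.02, 0.01)

# Assumed definition: count of up days minus count of down days in the window.
momentumScore <- function(r, window) {
  direction <- sign(r)  # +1 for an up day, -1 for a down day, 0 if flat
  out <- rep(NA_real_, length(r))
  for (i in seq_along(r)) {
    if (i >= window) out[i] <- sum(direction[(i - window + 1):i])
  }
  out
}

momentum.7d <- momentumScore(returns, 7)
# Day 7 covers 5 up days and 2 down days, giving a score of 3.
```

A variant could sum the returns themselves instead of their signs, which would weight large moves more heavily.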

Buy / Sell Classifier

Starting from a specific day, we look back at the returns over a period to determine whether we should have issued a buy or a sell. This label allows us to test our model.

  • buy.7d
  • buy.14d
  • buy.21d
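A minimal sketch of such a label, assuming that "buy" means the closing price is higher at the end of the holding period than on the day the recommendation would have been issued. The helper `buyLabel` and the `close` vector are hypothetical.

```r
# Hypothetical daily closing prices (illustrative only).
close <- c(100, 102, 101, 105, 107, 106, 110, 108, 112, 115)

# Assumed rule: label day t as 1 (buy) if close[t + horizon] > close[t],
# 0 (sell) otherwise. The last `horizon` days have no outcome yet, so NA.
buyLabel <- function(close, horizon) {
  future <- c(close[-seq_len(horizon)], rep(NA, horizon))
  as.integer(future > close)
}

buy.7d <- buyLabel(close, 7)
# buy.7d is 1, 1, 1 followed by seven NA values
```

Rows with an NA label would need to be dropped before training, since their outcome is not known yet.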

Putting it together

We can describe the state of the market by combining different features. For example, we can gauge how bullish or bearish a market is by multiplying our trend coefficient by the momentum.

btcValues = coinDataEngineering("BTC")
## [BTC] Feature engineering on full dataset for
##  [BTC] SMA20 [Done]
##  [BTC] Volatility [Done]
##  [BTC] Volume [Done]
##  [BTC] Momemtum [Done]
##  [BTC] BuyResult [Done]
##  [BTC] GoogleTrends [Done]
ethValues = coinDataEngineering("ETH")
## [ETH] Feature engineering on full dataset for
##  [ETH] SMA20 [Done]
##  [ETH] Volatility [Done]
##  [ETH] Volume [Done]
##  [ETH] Momemtum [Done]
##  [ETH] BuyResult [Done]
##  [ETH] GoogleTrends [Done]
xrpValues = coinDataEngineering("XRP")
## [XRP] Feature engineering on full dataset for
##  [XRP] SMA20 [Done]
##  [XRP] Volatility [Done]
##  [XRP] Volume [Done]
##  [XRP] Momemtum [Done]
##  [XRP] BuyResult [Done]
##  [XRP] GoogleTrends [Done]
ltcValues = coinDataEngineering("LTC")
## [LTC] Feature engineering on full dataset for
##  [LTC] SMA20 [Done]
##  [LTC] Volatility [Done]
##  [LTC] Volume [Done]
##  [LTC] Momemtum [Done]
##  [LTC] BuyResult [Done]
##  [LTC] GoogleTrends [Done]
eosValues = coinDataEngineering("EOS")
## [EOS] Feature engineering on full dataset for
##  [EOS] SMA20 [Done]
##  [EOS] Volatility [Done]
##  [EOS] Volume [Done]
##  [EOS] Momemtum [Done]
##  [EOS] BuyResult [Done]
##  [EOS] GoogleTrends [Done]

Technical Analysis

Logistic Regression

buy.7 ~ volume.7 + volume.14 + volume.21 + volatility.7 + volatility.14 + volatility.21 + momentum.7 + momentum.14 + momentum.21 + gtrend.7 + gtrend.14 + gtrend.21

# Model formula: the 7-day buy signal explained by the engineered features
model = buy.7 ~ volume.7 + volume.14 + volume.21 + volatility.7 + volatility.14 + volatility.21 + momentum.7 + momentum.14 + momentum.21 + gtrend.7 + gtrend.14 + gtrend.21
# Train our model first
logistic_reg = glm(model, data = training, family = binomial(link = "logit"))

btcResults = doLogisticReg(btcValues)
ethResults = doLogisticReg(ethValues)
xrpResults = doLogisticReg(xrpValues)
ltcResults = doLogisticReg(ltcValues)
eosResults = doLogisticReg(eosValues)
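The internals of `doLogisticReg` are not shown, so the following is only a sketch of the kind of train/test workflow it might wrap. The data is synthetic, only two of the features are used, and the 70/30 chronological split and the 0.5 threshold are assumptions.

```r
set.seed(42)

# Synthetic feature table standing in for one coin's engineered features.
n <- 200
features <- data.frame(
  momentum.7   = rnorm(n),
  volatility.7 = abs(rnorm(n))
)
# Synthetic label: a buy is more likely when momentum is positive.
features$buy.7 <- rbinom(n, 1, plogis(1.5 * features$momentum.7))

# Chronological split: train on the first 70% of days, test on the rest.
split_at <- floor(0.7 * n)
training <- features[1:split_at, ]
testing  <- features[(split_at + 1):n, ]

# Fit the logistic regression on the training window.
fit <- glm(buy.7 ~ momentum.7 + volatility.7, data = training,
           family = binomial(link = "logit"))

# Predicted probability of a buy on unseen days, then a hard 0/1 call.
prob <- predict(fit, newdata = testing, type = "response")
pred <- as.integer(prob > 0.5)
accuracy <- mean(pred == testing$buy.7)
```

Splitting by time rather than at random keeps future observations out of the training set, which matters for rolling features that overlap across days.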

Bitcoin

Ethereum

Ripple

EOS

Litecoin

Summary Results

Name      Accuracy           Sensitivity        Specificity        AUC
bitcoin   0.561538461538462  0.6                0.523076923076923  0.5760947
ethereum  0.449612403100775  0.5                0.39344262295082   0.4975892
ripple    0.705426356589147  0.734177215189873  0.66               0.7253165
litecoin  0.565891472868217  0.602564102564103  0.509803921568627  0.5892408
eos       0.75               0.6                0.9                0.7700000
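For reference, the four metrics in the table can be computed from predicted probabilities and actual labels as sketched below. The vectors `actual` and `prob` are made up, and the 0.5 cut-off is an assumption. The AUC uses the rank-based (Mann-Whitney) formula, so no extra package is needed.

```r
# Hypothetical true labels and predicted buy probabilities (illustrative only).
actual <- c(1, 1, 1, 0, 0, 1, 0, 0)
prob   <- c(0.9, 0.8, 0.4, 0.6, 0.3, 0.7, 0.2, 0.1)
pred   <- as.integer(prob > 0.5)

# Confusion-matrix counts.
tp <- sum(pred == 1 & actual == 1)
tn <- sum(pred == 0 & actual == 0)
fp <- sum(pred == 1 & actual == 0)
fn <- sum(pred == 0 & actual == 1)

accuracy    <- (tp + tn) / length(actual)
sensitivity <- tp / (tp + fn)  # share of real buys that were predicted
specificity <- tn / (tn + fp)  # share of real sells that were predicted

# AUC: probability that a random buy day is scored above a random sell day.
n_pos <- sum(actual == 1)
n_neg <- sum(actual == 0)
auc <- (sum(rank(prob)[actual == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
```

An AUC around 0.5, as for ethereum in the table, means the model ranks buy and sell days no better than chance.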

Going Further

We have built a solid base for a more complete analysis in the future. Here is a list of topics we can investigate from our current status:

  • Use the predicted probability to issue graded recommendations (strong buy, strong sell, neutral, …)
  • Portfolio Management / Optimization
    • Instead of choosing the top 5 by market cap, the analysis can be updated daily/hourly on all top currencies
    • Portfolio rebalancing
  • Identify arbitrage opportunities
  • Dynamic horizons (features already available for 7, 14, and 21 days)
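As an illustration of the first topic, the probability returned by the logistic regression could be binned into graded recommendations. The helper `recommend`, the five labels, and the cut-off values are assumptions chosen for the sketch. Real thresholds would need to be calibrated.

```r
# Map a predicted buy probability to a graded recommendation.
# The breakpoints are arbitrary placeholders, not calibrated values.
recommend <- function(prob) {
  cut(prob,
      breaks = c(0, 0.2, 0.4, 0.6, 0.8, 1),
      labels = c("Strong sell", "Sell", "Neutral", "Buy", "Strong buy"),
      include.lowest = TRUE)
}

grades <- recommend(c(0.05, 0.45, 0.95))
# grades: Strong sell, Neutral, Strong buy
```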